Tweet-biased summarization
Abstract
We examined whether the micro-blog comments people post after reading a web document can be exploited to improve the accuracy of a web document summarization system. We measured the effect of social information (i.e. tweets) on the accuracy of the generated summaries by comparing user preference for TBS (Tweet-Biased Summary) against GS (Generic Summary) in a crowdsourcing-based evaluation. Comparing TBS with two different GS baselines, we found that user preference for TBS was significantly higher than for GS. We also took random samples of the documents to assess the summaries in a traditional evaluation using ROUGE, in which TBS was also generally shown to be better than GS. We further analysed the influence of the number of tweets pointing to a web document on summarization accuracy, finding a moderate positive correlation between the number of tweets pointing to a web document and the performance of the generated TBS as measured by user preference. The results show that incorporating social information into the summary generation process can improve summary accuracy. The reasons people gave for choosing one summary over another in the crowdsourcing-based evaluation are also presented in this paper.
Similar resources
Topic Evolutionary Tweet Stream Clustering Algorithm and TCV Rank Summarization
Tweets are short text messages that are created and shared for both users and data analysts. Twitter, which receives over 400 million tweets per day, has emerged as an invaluable source of news, blogs, opinions and more. Our proposed work consists of three components: first, tweet stream clustering to cluster tweets using the k-means clustering algorithm, and second, a tweet cluster vector technique to generate rank summariz...
Tweet Contextualization using Continuous Space Vectors: Automatic Summarization of Cultural Documents
In this paper we describe our participation in the INEX 2016 Tweet Contextualization track. The tweet contextualization process aims at generating a short summary from Wikipedia documents related to the tweet. In our approach, we analyzed tweets and created a query to retrieve the most relevant Wikipedia article. We combine Information Retrieval and Automatic Text Summarization methods to gener...
A Hybrid Tweet Contextualization System using IR and Summarization
The article presents the experiments carried out as part of the participation in the Tweet Contextualization (TC) track of INEX 2012. We have submitted three runs. The INEX TC task has two main sub tasks, Focused IR and Automatic Summarization. In the Focused IR system, we first preprocess the Wikipedia documents and then index them using Nutch with NE field. Stop words are removed and all NEs ...
Ultra-stemming and Statistical Summarization at INEX 2013 Tweet Contextualization Track
According to the organizers, the objective of the 2013 INEX Tweet Contextualization Task is: “...The Tweet Contextualization aims at providing automatically information a summary that explains the tweet. This requires combining multiple types of processing from information retrieval to multi-document summarization including entity linking.” We present the Cortex summarizer applied to the INEX 2...
A Pipeline Tweet Contextualization System at INEX 2013
This article describes a pipeline system and preliminary results for Tweet Contextualization at INEX 2013. The system consists of three steps: tweet analysis, passage retrieval and summarization. For each tweet, key phrases are first extracted using the ArkTweet toolkit and several heuristics. They are then submitted as queries to the Indri search engine to retrieve relevant passag...
Tweet Contextualization (Answering Tweet Question) - the Role of Multi-document Summarization
The article presents the experiments carried out as part of the participation in the Tweet Contextualization (TC) track of INEX 2013. In our system there are three major sub-systems; i) Offline multi-document summarization, ii) Focused IR and iii) online multi-document Summarization. The Offline multi-document summarization system is based on document graph, clustering and sentence compression....
Journal: JASIST
Volume: 67
Publication date: 2016